AITopics | euclidean space

fe1ab2f77a9a0f224839cc9f1034a908-Paper-Conference.pdf

Neural Information Processing Systems, Apr-30-2026, 10:24:16 GMT

Riemannian SAM: Sharpness-Aware Minimization on Riemannian Manifolds

Neural Information Processing Systems, Apr-29-2026, 20:12:39 GMT

Recent advances in deep learning have begun to explore the underlying geometric properties of data, encouraging the investigation of techniques that operate on general manifolds, for example, hyperbolic or orthogonal neural networks. However, optimization algorithms for training such geometric deep models remain highly under-explored. In this paper, we introduce Riemannian SAM by generalizing conventional Euclidean SAM to Riemannian manifolds. We formulate sharpness-aware minimization on Riemannian manifolds, leading to a novel instantiation, Lorentz SAM. In addition, SAM variants proposed in previous studies, such as Fisher SAM, can be derived as special cases of our Riemannian SAM framework. We provide a convergence analysis of Riemannian SAM under a less aggressively decaying ascent learning rate than Euclidean SAM. Our analysis is theoretically sound and covers a diverse range of manifolds; it also provides guarantees for SAM variants such as Fisher SAM, whose convergence analyses were previously absent. Lastly, we illustrate the superiority of Riemannian SAM in terms of generalization over previous Riemannian optimization algorithms through experiments on knowledge graph completion and machine translation tasks.
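For orientation, the conventional Euclidean SAM update that the abstract takes as its starting point can be sketched as below. This is a minimal illustration of the Euclidean baseline only, not the paper's Riemannian or Lorentz variant. The names `sam_step`, `grad_fn`, `lr`, and `rho` are hypothetical and do not come from the paper.

```python
import math


def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One Euclidean SAM update on a list of floats (illustrative sketch)."""
    g = grad_fn(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        # Stationary point: there is no ascent direction, so leave w unchanged.
        return list(w)
    # Ascent step: move to the approximate worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient taken at the perturbed point
    # to the original weights.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def quad_grad(w):
    # Gradient of the toy objective f(w) = 0.5 * (w0^2 + 10 * w1^2).
    return [w[0], 10.0 * w[1]]


w = [1.0, 1.0]
for _ in range(200):
    w = sam_step(w, quad_grad, lr=0.05, rho=0.01)
```

With a constant `rho`, the iterates settle into a small neighborhood of the minimizer rather than converging exactly. This is one reason convergence analyses of SAM typically let the ascent learning rate decay, which is the quantity the abstract refers to.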

artificial intelligence, gradl, machine learning, (17 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Appendix

Neural Information Processing Systems, Apr-25-2026, 06:02:03 GMT

The literature on the geometric properties of Riemannian manifolds is immense, and we cannot hope to survey it here; as a starting point, we refer the reader to Burago et al. [93], Lee [94], and the references therein. On the other hand, as stated, it is only recently that the long-run non-asymptotic behavior of optimization algorithms on Riemannian manifolds (even smooth ones) has attracted much interest. For concision, we defer a detailed exposition of the remaining recent results to Appendix A of the paper's supplement. Additionally, in Appendix B we give several motivating examples that can be solved by Riemannian min-max optimization. Many application problems can be formulated as the minimization or maximization of a smooth function over a Riemannian manifold, which has triggered a line of research on extending classical first-order and second-order methods to the Riemannian setting, with asymptotic convergence to first-order stationary points in general [95].

artificial intelligence, exp 1, machine learning, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.89)

291d43c696d8c3704cdbe0a72ade5f6c-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 05:12:23 GMT

A.1 Broader impact. Our work introduces a general method for unsupervised 3D segmentation that can be used for any 3D voxel-grid data. This line of work is especially useful for analyzing biomedical data, as many types of biomedical data are volumetric and lack the ground-truth annotations required for fully or semi-supervised segmentation. For example, we may wish to study diseased tissue without sufficient understanding to ensure that unexplored features of interest are labelled in the training data. We illustrate the potential of our proposed approach for scientific discovery applications using our example of cryo-ET data in the Appendix. The discovered features can then be analyzed for their chemical identities and functions in diseased vs. healthy cells.

artificial intelligence, dataset, machine learning, (19 more...)

Neural Information Processing Systems

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.55)
Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

28e4ee96c94e31b2d040b4521d2b299e-Paper-Conference.pdf

Neural Information Processing Systems, Apr-25-2026, 04:56:04 GMT

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

16c628ab12dc4caca8e7712affa6c767-Paper-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 19:33:22 GMT

data mining, machine learning, manifold, (20 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > Canada (0.69)
North America > United States > California (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Communications > Social Media (0.69)
(3 more...)

Supplementary material for Discrete Valued Neural Communication in Structured Architectures Enhances Generalization

Neural Information Processing Systems, Apr-24-2026, 18:11:44 GMT

In this appendix, as a complement to Theorems 1-2, we provide additional theorems, Theorems 3-4, which further illustrate the two advantages of the discretization process by considering an abstract model with the discretization bottleneck. For the advantage in sensitivity, the error due to potential noise and perturbation without discretization (the third term ξ(w,r0,M0,d) > 0 in Theorem 4) is shown to be reduced to zero with discretization in Theorem 3. See Appendix C.1 for a simple comparison between the bound of Theorem 3 and that of Theorem 4 when the metric spaces (M,d) and (M0,d0) are chosen to be Euclidean spaces. We now introduce the notation used in Theorems 3-4. Here, ϕ_w represents a deep neural network with weight parameters w ∈ W ⊂ R^D, q_e is the discretization process with the codebook e ∈ E ⊂ R^(L×m), and h_θ represents a deep neural network with parameters θ ∈ Θ ⊂ R^ζ. Thus, the tuple of all learnable parameters is (w,e,θ).

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

120c9ab5c58ba0fa9dd3a22ace1de245-Paper-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 16:11:19 GMT

artificial intelligence, data mining, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
North America > United States > California (0.46)
North America > Canada > British Columbia (0.28)
North America > United States > Oregon (0.28)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Random Coordinate Descent on the Wasserstein Space of Probability Measures

Xu, Yewei, Li, Qin

arXiv.org Machine Learning, Apr-3-2026

Optimization over the space of probability measures endowed with the Wasserstein-2 geometry is central to modern machine learning and mean-field modeling. However, traditional methods relying on full Wasserstein gradients often suffer from high computational overhead in high-dimensional or ill-conditioned settings. We propose a randomized coordinate descent framework specifically designed for the Wasserstein manifold, introducing both Random Wasserstein Coordinate Descent (RWCD) and Random Wasserstein Coordinate Proximal-Gradient (RWCP) for composite objectives. By exploiting coordinate-wise structures, our methods adapt to anisotropic objective landscapes where full-gradient approaches typically struggle. We provide a rigorous convergence analysis across various landscape geometries, establishing guarantees under non-convex, Polyak-Łojasiewicz, and geodesically convex conditions. Our theoretical results mirror the classic convergence properties found in Euclidean space, revealing a compelling symmetry between coordinate descent on vectors and on probability measures. The developed techniques are inherently adaptive to the Wasserstein geometry and offer a robust analytical template that can be extended to other optimization solvers within the space of measures. Numerical experiments on ill-conditioned energies demonstrate that our framework offers significant speedups over conventional full-gradient methods.
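The Euclidean counterpart that the abstract says its results mirror, random coordinate descent on vectors, can be sketched as follows. This is only the classic vector-space method under simple assumptions, not the authors' RWCD or RWCP. The names `random_coordinate_descent`, `grad_i`, `step_sizes`, and the toy quadratic are hypothetical.

```python
import random


def random_coordinate_descent(grad_i, x0, step_sizes, n_iters, seed=0):
    """Classic Euclidean random coordinate descent (illustrative sketch).

    grad_i(x, i) returns the i-th partial derivative at x;
    step_sizes[i] is the step used whenever coordinate i is drawn.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(n_iters):
        # Draw one coordinate uniformly at random and update only that entry.
        i = rng.randrange(len(x))
        x[i] -= step_sizes[i] * grad_i(x, i)
    return x


# Ill-conditioned toy objective f(x) = 0.5 * sum(curv[i] * x[i]^2).
curv = [1.0, 100.0]


def quad_grad_i(x, i):
    return curv[i] * x[i]


# Coordinate-wise steps 1/curv[i] adapt to the anisotropy; a single
# full-gradient step size would be limited by the largest curvature.
x = random_coordinate_descent(
    quad_grad_i, [1.0, 1.0], [1.0 / a for a in curv], n_iters=50
)
```

The per-coordinate step sizes are the design point here: each coordinate moves at a rate matched to its own curvature, which is the kind of anisotropy the abstract says full-gradient approaches struggle with.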

artificial intelligence, machine learning, probability measure, (15 more...)

arXiv.org Machine Learning

2604.01606

Country: